Protocol to identify defined reprogramming factor expression using a factor-indexing single-nuclei multiome sequencing approach

Summary
Ectopic expression of lineage-specific transcription factors (TFs) of another cell type can induce cell fate reprogramming. However, the heterogeneity of reprogramming cells has been a challenge for data interpretation and model evaluation. Here, we present a protocol to characterize cells expressing defined factors during direct cell reprogramming using a factor-indexing approach based on single-nuclei multiome sequencing (FI-snMultiome-seq). We describe the steps for barcoding TFs, converting human fibroblasts to pancreatic ductal-like cells using defined TFs, and preparing libraries for FI-snMultiome-seq analysis. For complete details on the use and execution of this protocol, please refer to Fei et al.1


Institutional permissions
Experiments on live vertebrates or higher invertebrates must be performed in accordance with national guidelines and regulations. Experiments involving lentivirus must be conducted under Biosafety Level 2 (BSL2) as per institutional guidelines. We remind readers to obtain all the necessary permissions from the relevant institutions before starting the experiment.

Identify candidate TFs for your reprogramming model
Timing: variable
1. Select candidate TFs based on their reported roles in developmental biology and/or use a computational framework, such as Mogrify (https://mogrify.net/index),2 to predict the candidate TFs required for the transcriptomic switch from your source cell type to any target cell type.
Note: Computational frameworks based on gene expression data and regulatory network information can usually predict the TFs required for cell fate conversion for broad tissue types rather than specific cell types.2,3 You may combine the literature-curated TFs and the computationally predicted TFs as an initial pool.

Clone TF ORFs into Gateway donor vectors to generate entry clones
Timing: 1-2 weeks
2. Obtain the full-length ORFs of all individual TFs in Gateway donor vectors from publicly available resources such as Addgene, or obtain attB-flanked TF ORFs from a commercial source and clone them into a Gateway donor vector (e.g., pDONR221) by BP reaction following the manufacturer's instructions.
Note: 10x Multiome Gel Beads include a poly(dT) sequence that enables capture of 3′ polyadenylated mRNA for the gene expression (GEX) library. This capture cannot detect transcripts originating from exogenous vectors with >1 kb between the polyA and the ORF. The barcoding PCR reaction introduces a 20-bp random oligo (N20) with an NheI site into the pLenti6/V5-DEST vector downstream of the ORF and 78 bp upstream of its 3′ LTR region, which contains the polyA. Inserting TF barcodes close to the polyA enables their efficient capture and an optimal library size for Illumina sequencing. One barcoding PCR reaction generates a complex pool of barcoded vectors that can be used to barcode as many TFs as needed, and the excess PCR product can be kept for future use. When working with fewer than three TFs, it is also feasible to design individual barcode primers for the barcoding PCR, which avoids screening for uniquely barcoded vectors.
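To illustrate why a single N20 barcoding PCR can serve many TFs, the following minimal sketch estimates the size of the barcode space and the chance that two picked clones carry the same barcode. It assumes every 20-mer is equally likely, which a real oligo pool only approximates; the function names are illustrative and not part of the protocol.

```python
import math

BARCODE_LENGTH = 20  # N20 random oligo, as described in the note above


def barcode_space(length: int = BARCODE_LENGTH) -> int:
    """Number of distinct sequences for a random oligo of the given length."""
    return 4 ** length


def collision_probability(n_clones: int, length: int = BARCODE_LENGTH) -> float:
    """Approximate chance that at least two of n_clones share a barcode.

    Uses the birthday approximation 1 - exp(-n(n-1)/(2N)) and assumes a
    uniform barcode pool (an idealization).
    """
    space = barcode_space(length)
    return -math.expm1(-n_clones * (n_clones - 1) / (2 * space))


if __name__ == "__main__":
    print(f"Possible N20 barcodes: {barcode_space():,}")
    print(f"Collision chance for 6 clones: {collision_probability(6):.2e}")
```

Under this idealized model, duplicate barcodes among a handful of clones are very unlikely, but Sanger verification of each clone, as described in the protocol, remains the actual check.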

KEY RESOURCES
CRITICAL: Prepare 4-6 reactions to obtain enough PCR product after clean-up and to maintain its complexity.
b. Gently mix the reaction by pipetting and centrifuge briefly. Start PCR using the cycling conditions described in the table below.
c. Load the PCR product on a 0.7% agarose gel and run the gel at 120 V for 1 h. The expected PCR product, the linear pLenti6/V5-DEST-Barcode vector with an NheI site at both ends, is 8,724 bp (Figure 1A). Gel-purify the PCR product following the manufacturer's instructions. Troubleshooting 1.
Note: DNA fragments may exhibit a slight shift towards higher molecular weight on the agarose gel before clean-up (Figure 1A).
Pause point: Store the purified product at −20°C for long-term storage or proceed to the next step.
2. Restriction digest of linear pLenti6/V5-DEST-Barcode vectors to create sticky ends, self-circularization and transformation:
a. Prepare the following reaction components and mix gently. Spin down quickly and incubate at 37°C for 1 h.

Reprogramming of pancreatic ductal-like cells from human fibroblasts
Timing: 3-6 weeks, depending on the desired time points for investigation
This section describes how to induce pancreatic cells from human fibroblasts through lentiviral expression of defined TFs.

Lentivirus production in 293FT cells:
Culture 293FT cells as per manufacturer's instruction.
CRITICAL: Follow biosafety level 2 (BSL2) guidelines while working with lentivirus and be careful with the storage and disposal of biohazard waste.
a. Poly-L-lysine coating.
Note: Viral titers were determined using the PerkinElmer p24 ELISA kit, which gives a physical titer based on the concentration of p24 protein: 1 pg/mL of p24 ≈ 10^4 lentiviral particles/mL ≈ 100 TU/mL, assuming that each lentiviral particle contains 2,000 molecules of p24. Of note, the physical titer includes free p24 and defective viral particles in addition to infectious viral particles. For greater accuracy, we also determined the functional titer (TU/mL) by transducing HFFs with pLenti6/V5-EGFP virus, following the fluorescence titering protocol for lentivirus from Addgene. For the same pLenti6/V5-EGFP virus preparation, the functional titer corresponds to approximately 1% of the physical titer from the p24 ELISA. Based on this comparison, the physical titers of other lentiviruses were
c. Remove unbound material and rinse twice using DMEM.
Pause point: If not using the plate immediately, add 250 μL of DMEM per well and keep the plate in a cell culture incubator for up to 2 days. Do not let the plate dry after Matrigel coating.
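The titer relationships in the note above can be written as a small calculation. This is a sketch only: the two conversion factors are taken from the note, while the p24 reading, MOI and cell number in the example are hypothetical values chosen for illustration.

```python
PARTICLES_PER_PG_P24 = 1e4   # from the note: 1 pg of p24 ~ 10^4 particles
FUNCTIONAL_FRACTION = 0.01   # from the note: functional titer ~ 1% of physical


def physical_titer(p24_pg_per_ml: float) -> float:
    """Physical titer (particles/mL) from a p24 ELISA reading (pg/mL)."""
    return p24_pg_per_ml * PARTICLES_PER_PG_P24


def estimated_functional_titer(p24_pg_per_ml: float) -> float:
    """Estimated functional titer (TU/mL), assuming the ~1% ratio measured
    for pLenti6/V5-EGFP also holds for other virus preparations."""
    return physical_titer(p24_pg_per_ml) * FUNCTIONAL_FRACTION


def virus_volume_ul(moi: float, n_cells: int, tu_per_ml: float) -> float:
    """Volume of virus (in microliters) needed to reach a given MOI."""
    return moi * n_cells / tu_per_ml * 1000.0


if __name__ == "__main__":
    p24 = 1e6  # hypothetical ELISA reading, pg/mL
    titer = estimated_functional_titer(p24)
    print(f"Estimated functional titer: {titer:.2e} TU/mL")
    # Hypothetical example: MOI 5 on 20,000 seeded cells
    print(f"Virus volume: {virus_volume_ul(5, 20_000, titer):.2f} uL")
```

Because the physical titer also counts free p24 and defective particles, the estimate is only as good as the assumed 1% ratio.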

Direct reprogramming of HFFs to pancreatic ductal-like cells:
a. Day 0: Seed early-passage HFFs at 20,000 cells per well in a Matrigel-coated 24-well plate.
Note: Scale the cell number based on the surface area if using other cell cultureware. Use the cells before passage 8.
Note: Although the cells are transduced with a pool of six TFs, cells carrying any sub-combination of TFs from the 6F pool can also be present. This results in heterogeneity of the reprogramming cells. Since ectopic TF-expressing cells can be identified based on their barcodes, the non-transduced cells without any barcodes can either be used as control cells or excluded from the analysis. Optionally, non-transduced cells can be eliminated during the experiment by a seven-day blasticidin selection at 5 μg/mL. The MOI used for each of the transduced TFs was determined based on their expression levels in adult human ductal cells,11 together with earlier test results.1
c. Day 2: Change to fresh fibroblast medium.
d. Day 3: Change to basal medium for reprogramming cells supplemented with 100 ng/mL Activin A, 1 μM CHIR99021 and 50 ng/mL FGF7.
e. Day 5: Change to basal medium for reprogramming cells supplemented with 100 ng/mL Activin A and 50 ng/mL FGF7.
f. Day 7: Change to basal medium for reprogramming cells supplemented with 2 μM retinoic acid, 500 nM PD0325901 and 200 nM LDN193189.
g. Day 8: Change to basal medium for reprogramming cells supplemented with 100 ng/mL Activin A and 500 nM PD0325901.
h. From day 10 onwards, culture cells in maintenance medium for reprogramming cells. Pancreatic ductal-like cells with epithelial cell morphology start to appear at around day 21 (Figure 1C). Troubleshooting 4.
Note: During cell reprogramming, passage cells at a 1:2 ratio using Accutase when they reach around 90% confluency and replate them in fresh Matrigel-coated plates.
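The advice to scale the seeding number by surface area can be sketched as a proportional calculation. The growth areas below are nominal values assumed for illustration; they vary between manufacturers and should be checked against the cultureware actually used.

```python
REFERENCE_CELLS = 20_000    # cells per well of a 24-well plate (step a)
REFERENCE_AREA_CM2 = 1.9    # assumed nominal growth area of one 24-well well

# Assumed nominal growth areas (cm^2); check the cultureware datasheet.
NOMINAL_AREA_CM2 = {
    "24-well": 1.9,
    "12-well": 3.8,
    "6-well": 9.5,
}


def scaled_seeding(area_cm2: float) -> int:
    """Cells to seed so that the density matches the 24-well reference."""
    if area_cm2 <= 0:
        raise ValueError("growth area must be positive")
    return round(REFERENCE_CELLS * area_cm2 / REFERENCE_AREA_CM2)


if __name__ == "__main__":
    for vessel, area in NOMINAL_AREA_CM2.items():
        print(f"{vessel}: seed about {scaled_seeding(area):,} cells per well")
```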
Preparing library for FI-snMultiome-seq analysis

Timing: 3 days
This section describes library preparation for FI-snMultiome-seq.
8. Collect reprogramming cells at the desired time points and isolate nuclei using demonstrated protocol CG000365 from 10x Genomics. Troubleshooting 5 and 6.
CRITICAL: Using good-quality cells is critical for the success of the experiments. Remove dead cells following the demonstrated protocol from 10x Genomics if cell viability is less than 90%. For balanced representation of different conditions in the analysis, isolate nuclei from individual conditions, count the number of nuclei from each condition and pool the desired number of nuclei from the different conditions for single-nuclei capture. Avoid pooling cells from different conditions before nuclei isolation.
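The balanced pooling described above amounts to converting a target nuclei number into a pipetting volume for each condition. The sketch below assumes nuclei concentrations have already been measured; the condition names and values in the example are hypothetical.

```python
def pooling_volumes(conc_per_ul: dict[str, float],
                    nuclei_per_condition: int) -> dict[str, float]:
    """Volume (uL) to take from each nuclei suspension so that every
    condition contributes the same number of nuclei to the pool."""
    volumes = {}
    for condition, conc in conc_per_ul.items():
        if conc <= 0:
            raise ValueError(f"concentration for {condition!r} must be positive")
        volumes[condition] = nuclei_per_condition / conc
    return volumes


if __name__ == "__main__":
    # Hypothetical counts (nuclei per uL) from two separately isolated conditions
    counts = {"condition_A": 2000.0, "condition_B": 1000.0}
    for name, vol in pooling_volumes(counts, 4000).items():
        print(f"{name}: take {vol:.1f} uL")
```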

Note: The required number of nuclei to be profiled per condition depends on the experimental setting, e.g., the number of transduced TFs. For high-quality (Grade A) nuclei, 800 nuclei per TF is a good starting point when overexpressing one TF. If the quality of nuclei is less optimal (Grade B), a minimum of 1,000 nuclei is recommended for a single-TF condition. Please refer to the 10x Genomics instructions to assess nuclei quality.
9. Prepare libraries for ATAC and GEX following user guide CG000338 from 10x Genomics.
10. Barcode library preparation for detecting TF barcodes:
a. Prepare the following reaction components on ice. Mix thoroughly and centrifuge briefly.
Note: The pre-amplified sample is the purified product after step 4.3p of user guide CG000338.
b. Start PCR using the cycling conditions described in the table below.

Reagent    Amount
Pre-amplified sample    3 μL

h. Perform library quantification for sequencer clustering using the KAPA Library Quantification Kit following the manufacturer's instructions and determine the concentration based on the fragment size derived from the TapeStation trace.
i. Sequence the ATAC and GEX libraries following user guide CG000338. Sequence the barcode library using the same sequencing parameters as for the GEX library at a minimum of 2,000 read pairs per nucleus.

Optional: Determine fragment size using Bioanalyzer or LabChip.
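The quantification and sequencing-depth steps can be sketched as two small calculations. The 2,000 read pairs per nucleus comes from the protocol; the 452-bp size of the qPCR standard is an assumption based on our understanding of the KAPA kit and should be verified against the kit's documentation. The function names are illustrative.

```python
MIN_READ_PAIRS_PER_NUCLEUS = 2_000  # minimum stated for the barcode library
QPCR_STANDARD_BP = 452              # ASSUMED size of the qPCR DNA standard


def size_adjusted_concentration(qpcr_conc: float, avg_fragment_bp: float) -> float:
    """Adjust a qPCR-derived concentration for the library's average
    fragment size (same units in and out)."""
    if avg_fragment_bp <= 0:
        raise ValueError("average fragment size must be positive")
    return qpcr_conc * QPCR_STANDARD_BP / avg_fragment_bp


def min_barcode_read_pairs(n_nuclei: int) -> int:
    """Minimum read pairs to request for the barcode library."""
    return n_nuclei * MIN_READ_PAIRS_PER_NUCLEUS


if __name__ == "__main__":
    # Hypothetical example: 5,000 captured nuclei, 904-bp average fragment
    print(f"Read pairs needed: {min_barcode_read_pairs(5000):,}")
    print(f"Adjusted concentration: {size_adjusted_concentration(100.0, 904):.1f}")
```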

EXPECTED OUTCOMES
The FI-snMultiome-seq protocol enables efficient capture of TF expression from exogenous expression vectors using a factor-indexing approach, which significantly improves on the current 10x single-cell multiome assay. By reading the factor barcodes, it allows the identification and characterization of cells that express any of the stochastic combinations of transduced TFs during direct reprogramming. Thus, it provides a robust molecular analysis for comparing various reprogramming conditions and assessing the effect of individual TFs in one experiment. Non-transduced cells can either be used as controls or excluded from the analysis to reduce noise in the data. Notably, our methodology also enables separating the expression of TFs from exogenous vectors and from endogenous genes by separately counting the transcripts from factor barcodes (exogenous) and mRNA (endogenous). This allows investigation of the effect of transduced TFs on their endogenous expression. Therefore, the application of FI-snMultiome-seq opens a path to study TF-mediated direct reprogramming at single-cell resolution, providing a comprehensive overview of the remodeling of the transcriptomic and epigenomic landscape during transdifferentiation.
Furthermore, the FI-snMultiome-seq protocol is versatile. The primers used for barcode insertion can easily be modified to work with other lentiviral constructs. Beyond direct reprogramming, FI-snMultiome-seq can also be applied to other experimental conditions and cellular models that aim to study the cellular response to overexpression of genes from lentiviral constructs.

QUANTIFICATION AND STATISTICAL ANALYSIS
The data were processed and analyzed as described in Fei et al.1 TF barcodes were extracted from the FASTQ files of the barcode library using the Sequence Input/Output interface (SeqIO)4 from BioPython,5 and raw sequencing data from the GEX and ATAC libraries were processed with the Cell Ranger ARC pipeline (v2.0.1) for demultiplexing, alignment, and feature counting, followed by analysis using Seurat (v4.1.1)6 and Signac (v1.7.09).7 The number of reads for each TF barcode was counted and added as a column to the metadata of the Seurat object. Cells with <1,000 RNA unique molecular identifier (UMI) counts or ATAC fragments, >100,000 RNA UMI counts or >500,000 ATAC fragments, and >30% mitochondrial RNA were excluded. For scRNA-seq, the raw UMI counts were normalized and scaled using SCTransform, regressing out the cell cycle effect and the percentage of UMI counts originating from mitochondrial genes. The top 3,000 variable genes were selected for principal-component analysis (PCA). For scATAC-seq, peaks were called using MACS2 (v2.2.7.1).8 Peaks on nonstandard chromosomes and in GRCh38 blacklist regions were excluded, followed by term frequency-inverse document frequency (TF-IDF) normalization and latent semantic indexing (LSI) reduction. PCs 1-35 from the scRNA-seq data and LSI dimensions 2-35 from the scATAC-seq data were used for constructing the weighted-nearest neighbor (WNN) graph. Per-cell motif activity scores were calculated using the Signac RunChromVAR wrapper.9 Cell type scores were computed using ScType (v1.0)10 with markers for pancreatic cell types from the ScType database and fibroblast markers from the literature.
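The barcode-counting step can be illustrated with a minimal sketch. It is not the authors' script: it uses only the Python standard library as a stand-in for BioPython's SeqIO, it assumes that the first 16 bases of Read 1 are the cell barcode (our understanding of the 10x GEX read layout, to be verified), and it assumes the TF barcode appears in Read 2 in the orientation given (reverse-complement it if not). All sequences in the example are made up.

```python
from collections import Counter
from itertools import islice

CELL_BARCODE_LEN = 16  # ASSUMED length of the cell barcode at the start of Read 1


def read_sequences(fastq_lines):
    """Yield the sequence line of every 4-line FASTQ record."""
    for line in islice(fastq_lines, 1, None, 4):
        yield line.strip()


def count_tf_barcodes(r1_lines, r2_lines, tf_barcodes):
    """Count reads per (cell barcode, TF) pair.

    tf_barcodes maps a TF name to its 20-bp barcode as verified by Sanger
    sequencing; a read is assigned to the first TF whose barcode it contains.
    """
    counts = Counter()
    for r1, r2 in zip(read_sequences(r1_lines), read_sequences(r2_lines)):
        cell = r1[:CELL_BARCODE_LEN]
        for tf, barcode in tf_barcodes.items():
            if barcode in r2:
                counts[(cell, tf)] += 1
                break
    return counts


if __name__ == "__main__":
    # Made-up barcodes and reads, for illustration only
    barcodes = {"TF_A": "ACGT" * 5, "TF_B": "TTGCA" * 4}
    r1 = ["@1", "A" * 16 + "C" * 12, "+", "I" * 28,
          "@2", "A" * 16 + "G" * 12, "+", "I" * 28,
          "@3", "T" * 16 + "C" * 12, "+", "I" * 28]
    r2 = ["@1", "GG" + barcodes["TF_A"] + "CC", "+", "I" * 24,
          "@2", "GG" + barcodes["TF_A"] + "CC", "+", "I" * 24,
          "@3", "GG" + barcodes["TF_B"] + "CC", "+", "I" * 24]
    for (cell, tf), n in count_tf_barcodes(r1, r2, barcodes).items():
        print(cell, tf, n)
```

The resulting per-cell counts are the kind of table that could be added as metadata columns to the Seurat object, as the text describes. Exact substring matching is the simplest choice here and does not tolerate sequencing errors within the barcode.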
LIMITATIONS
FI-snMultiome-seq is based on lentiviral gene delivery. As with other lentiviral expression systems, viral expression may be silenced after a few passages. This limits its application in analyzing mature reprogramming cells at late stages. In addition, when transducing a pool of TFs, cells carrying any sub-combination from the TF pool can in theory be present. In practice, however, only some of the sub-combinations may be present, and the cell number for some conditions can be small. Thus, careful experimental planning, including additional cells for the sub-combination(s) of interest, is important for optimal performance of FI-snMultiome-seq.
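To support the planning mentioned above, the sketch below enumerates the possible TF sub-combinations and estimates how often each would occur. It relies on a simplifying model that is not part of the protocol: each TF is assumed to be delivered independently, with the number of integrations per cell following a Poisson distribution at the given MOI. The TF names and MOI values are placeholders.

```python
import math
from itertools import combinations


def all_subcombinations(tfs):
    """All non-empty subsets of the TF pool (2^n - 1 of them)."""
    return [combo for k in range(1, len(tfs) + 1)
            for combo in combinations(tfs, k)]


def expected_fraction(combo, moi_by_tf):
    """Expected fraction of cells carrying exactly the TFs in combo,
    under the ASSUMED independent Poisson model."""
    fraction = 1.0
    for tf, moi in moi_by_tf.items():
        p_present = 1.0 - math.exp(-moi)
        fraction *= p_present if tf in combo else 1.0 - p_present
    return fraction


def nuclei_to_profile(combo, moi_by_tf, target_nuclei):
    """Total nuclei to profile to expect target_nuclei with this combination."""
    return math.ceil(target_nuclei / expected_fraction(combo, moi_by_tf))


if __name__ == "__main__":
    moi = {f"TF{i}": 1.0 for i in range(1, 7)}  # placeholder MOIs
    combos = all_subcombinations(list(moi))
    print(f"{len(combos)} possible sub-combinations")
    full = tuple(moi)
    print(f"All six TFs: {expected_fraction(full, moi):.3f} of cells")
```

Real combination frequencies can deviate substantially from this model, so measured barcode counts from a pilot run are a better basis for planning when available.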

TROUBLESHOOTING
Problem 1
Low colony count after transformation (related to steps 1-2).

Potential solution
Confirm all the entry and destination clones by sequencing to check the integrity of the BP and LR recombination sites. Confirm the ampicillin concentration. Avoid multiple freeze-thaw cycles of the ligation product. Repeat the transformation using a fresh batch of competent cells. Purify the restriction-digested product to remove contaminants and repeat the ligation. Purify the barcoding PCR product to remove contaminants. Repeat the restriction digest and ligation.

Problem 2
Incorrect recombination of barcoded lentiviral vectors (related to step 3).

Potential solution
Always use Stbl3 competent cells or others that are suitable for propagation of lentiviral constructs.
Grow the bacterial culture at a lower temperature, such as 30°C. Lower the rotation speed to 200 rpm. Use the dual combination of ampicillin and chloramphenicol for selection.

Problem 3
Low yield of lentiviruses (related to step 5).

Potential solution
Use early-passage 293FT cells.

Figure 1. Barcoding PCR product and morphological changes of cells during reprogramming
(A) Gel image of the barcoding PCR product showing the expected product at 8,724 bp.
(B) Gel images of the restriction-digested barcoded vector pLenti6/V5-DEST vector-1 (left) and the in silico digestion using NEBcutter (right).
(C) Cell morphological changes during reprogramming. Scale bar represents 100 μm.
Note: To ensure the libraries are minimally amplified, add SYBR Green I to the PCR reaction master mix and run a qPCR test. Calculate the optimal cycle number for each sample by determining the number of cycles required to reach 1/3 of the maximum R.
c. Cleanup with SPRIselect beads:
i. Vortex the SPRIselect beads until fully resuspended. Add 40 μL (0.8×) SPRIselect beads to each sample. Pipette mix 10 times.
ii. Incubate for 5 min at 20°C-25°C.
iii. Centrifuge briefly. Place on the 10x magnetic separator (magnet·High) until the solution clears.
iv. Remove the supernatant.
v. Add 200 μL of 80% ethanol to the pellet. Wait 30 s. Remove the ethanol. Repeat the ethanol wash once.
vi. Centrifuge briefly. Place on the magnet·Low. Remove any remaining ethanol.
vii. Remove the tube strip from the magnet. Immediately add 40.5 μL Buffer EB.
viii. Pipette mix 10 times and incubate for 2 min at 20°C-25°C.
ix. Centrifuge briefly. Place on the magnet·Low until the solution clears.
x. Transfer 40 μL of sample to a new tube strip.
Pause point: Store at −20°C for long-term storage or proceed to the next step.
d. Prepare the following reaction components on ice. Mix thoroughly and centrifuge briefly.
CRITICAL: Choose different indices for the samples in a multiplexed sequencing run.
Note: Barcode libraries can be pooled in the same sequencing run as GEX libraries if the indices are compatible.
e. Start PCR using the cycling conditions described in the table below.
Note: To ensure the libraries are minimally amplified, add SYBR Green I to the PCR reaction master mix and run a qPCR test. Calculate the optimal cycle number for each sample by determining the number of cycles required to reach 1/3 of the maximum R.
f. Cleanup with SPRIselect beads:
i. Vortex the SPRIselect beads until fully resuspended. Add 40 μL (0.8×) SPRIselect beads to each sample. Pipette mix 10 times.
ii. Incubate for 5 min at 20°C-25°C.
iii. Centrifuge briefly. Place on the 10x magnetic separator (magnet·High) until the solution clears.
iv. Remove the supernatant.
v. Add 200 μL of 80% ethanol to the pellet. Wait 30 s. Remove the ethanol. Repeat the ethanol wash once.
vi. Centrifuge briefly. Place on the magnet·Low. Remove any remaining ethanol.
vii. Remove the tube strip from the magnet. Immediately add 30.5 μL Buffer EB.
viii. Pipette mix and incubate for 2 min at 20°C-25°C.
ix. Centrifuge briefly. Place on the magnet·Low until the solution clears.
x. Transfer 30 μL of sample to a new tube strip.
Pause point: Store the sample at −20°C for long-term storage or proceed to the next step.
g. Run 1 μL of the ATAC, GEX and barcode libraries at 1:5 dilution on an Agilent TapeStation High Sensitivity D5000 ScreenTape to determine the average fragment size (Figures 2A-2C).
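The qPCR-based choice of cycle number described in the notes above can be sketched as follows. The sketch assumes that "R" refers to the baseline-corrected fluorescence signal and that the first cycle reaching one third of the plateau is taken; both are our reading of the note rather than something the protocol states explicitly. The curve in the example is invented.

```python
def optimal_cycle_number(fluorescence, fraction=1 / 3):
    """Return the first cycle (1-based) whose signal reaches the given
    fraction of the maximum signal in the amplification curve.

    fluorescence: baseline-corrected readings, one per cycle (ASSUMED input).
    """
    if not fluorescence:
        raise ValueError("amplification curve is empty")
    threshold = fraction * max(fluorescence)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    raise ValueError("threshold never reached")  # unreachable for fraction <= 1


if __name__ == "__main__":
    # Invented amplification curve, for illustration only
    curve = [0.0, 0.1, 0.2, 0.5, 1.2, 2.5, 4.0, 5.2, 5.8, 6.0]
    print(f"Use {optimal_cycle_number(curve)} cycles for this sample")
```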

Prepare the PCR reaction master mix on ice as described in the table below.
Note: NheI creates sticky ends for self-circularization. DpnI helps to remove the circular template from the barcoding PCR reaction in step 1.
b. Gel-purify the restriction-digested product following the manufacturer's instructions.
c. Prepare the following reaction components and mix thoroughly. Spin down quickly and incubate at 16°C for 12-20 h.
d. Use up to 2 μL of the mixture for transformation of 50 μL ccdB Survival 2T1R competent cells. Troubleshooting 1.
Pause point: Store the ligated product at −20°C for up to 1 month or proceed to the next step.
3. Screening of pLenti6/V5-DEST vectors carrying unique barcodes:
a. Screen single colonies and purify plasmids following the manufacturer's instructions.
b. Verify pLenti6/V5-DEST-Barcode clones by restriction digest and obtain the barcode sequences of individual clones by Sanger sequencing.
i. Prepare the following reaction components and mix gently. Spin down quickly and incubate at 37°C for 1 h.
Note: AflII and XhoI help to confirm that no rearrangement in the LTR regions of the pLenti6/V5-DEST-Barcode vectors has taken place. NheI helps to confirm insertion of the barcode.
ii. Run the restriction-digested product on a 0.7% agarose gel. Expected bands from correct clones are 3,876 bp, 3,656 bp, 960 bp and 220 bp (Figure 1B).
iii. Obtain the barcode sequence for individual clones by Sanger sequencing using the Sseq_Barcode primer (see key resources table). Troubleshooting 2.
CRITICAL: Avoid using barcodes that introduce new AflII, XhoI or NheI sites.

Cloning of individual TFs into barcoded Gateway destination vectors
Timing: 4-5 days, depending on the number of TFs
This section is aimed at cloning individual TFs into barcoded pLenti6/V5-DEST vectors and preparing the plasmids of barcoded TFs for lentivirus production.
4. Gateway recombination cloning and plasmid purification:
a. Clone the ORFs of individual TFs into barcoded pLenti6/V5-DEST vectors by LR reaction following the manufacturer's instructions.
CRITICAL: Use different barcoded pLenti6/V5-DEST vectors for individual TFs to ensure each TF is labeled with a unique barcode.
b. Use 1 μL of the LR reaction for transformation of 50 μL Stbl3 competent cells.
c. Screen for positive clones by restriction digest as described in step 3b and further verify the sequences of the ORF and barcode by Sanger sequencing using the Sseq_ORF and Sseq_Barcode primers (see key resources table), respectively.
d. Perform plasmid Midiprep for each barcoded factor following the manufacturer's instructions.
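The requirement to avoid barcodes that introduce new AflII, XhoI or NheI sites can be checked on the Sanger-verified barcode sequences with a short script. The recognition sequences below are the standard ones for these enzymes; the flanking sequences passed to the function are placeholders, since a site can also arise across the junction between the barcode and the vector. The function name is illustrative.

```python
# Standard recognition sequences; all three are palindromic, so scanning
# one strand is sufficient.
RECOGNITION_SITES = {
    "AflII": "CTTAAG",
    "XhoI": "CTCGAG",
    "NheI": "GCTAGC",
}


def sites_introduced(barcode, left_flank="", right_flank=""):
    """Return the enzymes whose site overlaps the barcode.

    Sites lying entirely within a flank (e.g. the designed NheI site next
    to the barcode) are ignored; only sites that include at least one
    barcode base are reported.
    """
    context = (left_flank + barcode + right_flank).upper()
    start, end = len(left_flank), len(left_flank) + len(barcode)
    hits = []
    for enzyme, site in RECOGNITION_SITES.items():
        for i in range(len(context) - len(site) + 1):
            overlaps_barcode = i < end and i + len(site) > start
            if overlaps_barcode and context[i:i + len(site)] == site:
                hits.append(enzyme)
                break
    return sorted(hits)


if __name__ == "__main__":
    # Made-up barcodes, for illustration only
    print(sites_introduced("ACGTACGTACGTACGTACGT"))   # no new sites
    print(sites_introduced("AAAACTCGAGAAAAAAAAAA"))   # contains an XhoI site
```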